Sharp thresholds for high-dimensional and noisy recovery of sparsity
The problem of consistently estimating the sparsity pattern of a vector $\beta^* \in \mathbb{R}^p$ based on observations contaminated by noise arises in various contexts, including subset selection in regression, structure estimation in graphical models, sparse approximation, and signal denoising. We analyze the behavior of $\ell_1$-constrained quadratic programming (QP), also referred to as the Lasso, for recovering the sparsity pattern. Our main result is to establish a sharp relation between the problem dimension $p$, the number $k$ of non-zero elements in $\beta^*$, and the number of observations $n$ that are required for reliable recovery. For a broad class of Gaussian ensembles satisfying mutual incoherence conditions, we establish the existence of, and compute explicit values for, thresholds $\theta_\ell$ and $\theta_u$ with the following properties: for any $\epsilon > 0$, if $n > 2(\theta_u + \epsilon)\, k \log(p - k) + k + 1$, then the Lasso succeeds in recovering the sparsity pattern with probability converging to one for large problems, whereas for $n < 2(\theta_\ell - \epsilon)\, k \log(p - k) + k + 1$, the probability of successful recovery converges to zero. For the special case of the uniform Gaussian ensemble, we show that $\theta_\ell = \theta_u = 1$, so that the threshold is sharp and exactly determined.
Comment: Appeared as Technical Report 708, Department of Statistics, UC Berkeley
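
As a rough illustration of the threshold scaling (a minimal simulation sketch, not code from the paper; the regularization level, support-detection tolerance, and trial counts are illustrative assumptions), one can check empirically that Lasso support recovery on the uniform Gaussian ensemble transitions around $n \approx 2k \log(p - k) + k + 1$:

    # Illustrative sketch only: empirical Lasso support recovery below and
    # above the n ~ 2*k*log(p - k) + k + 1 threshold (uniform Gaussian
    # ensemble, where theta_l = theta_u = 1).
    import numpy as np
    from sklearn.linear_model import Lasso

    rng = np.random.default_rng(0)
    p, k, sigma = 512, 8, 0.5
    n_crit = int(2 * k * np.log(p - k) + k + 1)

    def support_recovered(n: int) -> bool:
        X = rng.standard_normal((n, p))            # uniform Gaussian ensemble
        beta = np.zeros(p)
        beta[:k] = 1.0                             # true support: first k coordinates
        y = X @ beta + sigma * rng.standard_normal(n)
        lam = sigma * np.sqrt(2 * np.log(p) / n)   # assumed theoretical scaling
        fit = Lasso(alpha=lam, max_iter=50_000).fit(X, y)
        return set(np.flatnonzero(np.abs(fit.coef_) > 1e-3)) == set(range(k))

    for n in (n_crit // 2, 2 * n_crit):            # below vs. above the threshold
        rate = np.mean([support_recovered(n) for _ in range(20)])
        print(f"n={n}: empirical recovery rate {rate:.2f}")
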
Randomized Sketches of Convex Programs with Sharp Guarantees
Random projection (RP) is a classical technique for reducing storage and
computational costs. We analyze RP-based approximations of convex programs, in
which the original optimization problem is approximated by the solution of a
lower-dimensional problem. Such dimensionality reduction is essential in
computation-limited settings, since the complexity of general convex
programming can be quite high (e.g., cubic for quadratic programs, and
substantially higher for semidefinite programs). In addition to computational
savings, random projection is also useful for reducing memory usage, and has
useful properties for privacy-sensitive optimization. We prove that the
approximation ratio of this procedure can be bounded in terms of the geometry
of the constraint set. For a broad class of random projections, including those
based on various sub-Gaussian distributions as well as randomized Hadamard and
Fourier transforms, the data matrix defining the cost function can be projected
down to the statistical dimension of the tangent cone of the constraints at the
original solution, which is often substantially smaller than the original
dimension. We illustrate consequences of our theory for various cases,
including unconstrained and $\ell_1$-constrained least squares, support vector
machines, and low-rank matrix estimation, and we discuss implications for
privacy-sensitive optimization and some connections with de-noising and
compressed sensing.
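
As a concrete (assumed) instance of the procedure for unconstrained least squares, the following sketch compares the solution of the original program with that of a Gaussian-projected one; the sketch dimension m here is an illustrative choice rather than the statistical dimension prescribed by the theory:

    # Illustrative sketch only: Gaussian random projection of an
    # overdetermined least-squares program, comparing original and
    # sketched solutions by their cost ratio.
    import numpy as np

    rng = np.random.default_rng(1)
    n, d, m = 10_000, 50, 400                     # tall data matrix, small sketch

    A = rng.standard_normal((n, d))
    x_true = rng.standard_normal(d)
    b = A @ x_true + 0.1 * rng.standard_normal(n)

    # Original solution vs. solution of the m-dimensional projected program.
    x_full, *_ = np.linalg.lstsq(A, b, rcond=None)
    S = rng.standard_normal((m, n)) / np.sqrt(m)  # sub-Gaussian (Gaussian) sketch
    x_sketch, *_ = np.linalg.lstsq(S @ A, S @ b, rcond=None)

    cost = lambda x: np.linalg.norm(A @ x - b) ** 2
    print("approximation ratio:", cost(x_sketch) / cost(x_full))
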
Restricted strong convexity and weighted matrix completion: Optimal bounds with noise
We consider the matrix completion problem under a form of row/column weighted
entrywise sampling, including the case of uniform entrywise sampling as a
special case. We analyze the associated random observation operator, and prove
that with high probability, it satisfies a form of restricted strong convexity
with respect to weighted Frobenius norm. Using this property, we obtain as
corollaries a number of error bounds on matrix completion in the weighted
Frobenius norm under noisy sampling and for both exact and near low-rank
matrices. Our results are based on measures of the "spikiness" and
"low-rankness" of matrices that are less restrictive than the incoherence
conditions imposed in previous work. Our technique involves an $M$-estimator
that includes controls on both the rank and spikiness of the solution, and we
establish non-asymptotic error bounds in weighted Frobenius norm for recovering
matrices lying within $\ell_q$-"balls" of bounded spikiness. Using
information-theoretic methods, we show that no algorithm can achieve better
estimates (up to a logarithmic factor) over these same sets, showing that our
conditions on matrices and associated rates are essentially optimal.
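
For intuition, here is a minimal proximal-gradient sketch of a nuclear-norm-penalized completion estimator under uniform entrywise sampling. This is an assumed illustration rather than the paper's estimator: in particular, it omits the spikiness (maximum-entry) control that the paper's $M$-estimator includes, and the penalty level is an illustrative choice.

    # Illustrative sketch only: nuclear-norm-regularized matrix completion
    # from uniform entrywise samples, solved by proximal gradient descent
    # with singular value soft-thresholding.
    import numpy as np

    rng = np.random.default_rng(2)
    d, r, frac, lam, sigma = 60, 3, 0.4, 1.0, 0.1

    L = rng.standard_normal((d, r))
    M = L @ L.T / np.sqrt(r)                      # exactly rank-r target
    mask = rng.random((d, d)) < frac              # uniform entrywise sampling
    Y = np.where(mask, M + sigma * rng.standard_normal((d, d)), 0.0)

    def svt(Z, tau):
        """Singular value soft-thresholding: prox of tau * nuclear norm."""
        U, s, Vt = np.linalg.svd(Z, full_matrices=False)
        return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

    X = np.zeros((d, d))
    for _ in range(200):                          # proximal gradient steps
        grad = mask * (X - Y)                     # gradient of squared loss on samples
        X = svt(X - grad, lam * frac)             # unit step; illustrative penalty
    print("relative Frobenius error:", np.linalg.norm(X - M) / np.linalg.norm(M))
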
…